Large-Margin Multi-Label Causal Feature Learning
Authors
Abstract
In multi-label learning, an example is represented by a descriptive feature vector associated with several labels. Simply treating labels as independent or correlated is crude; it would be beneficial to define and exploit the causality between multiple labels. For example, the image label ‘lake’ implies the label ‘water’, but not vice versa. Since the original features are a disorderly mixture of the properties originating from different labels, it is intuitive to factorize these raw features to clearly represent each individual label and its causal relationships. Following the large-margin principle, we propose an effective approach to discover the causal features of multiple labels, thus revealing the causality between labels from the perspective of features. We show theoretically that the proposed approach is a tight approximation of the empirical multi-label classification error, and that the revealed causality strengthens the consistency of the algorithm. Extensive experiments on synthetic and real-world data demonstrate that the proposed algorithm effectively discovers label causality, generates causal features, and improves multi-label learning.
Introduction
In the conventional single-label learning scenario, an example is associated with a single label that characterizes its property. In many real-world applications, however, an example naturally has several class labels. For example, an image of a natural scene can simultaneously be annotated with ‘sky’, ‘mountains’, ‘trees’, ‘lakes’, and ‘water’. Multi-label learning (Luo et al. 2013b; 2013a; Xu, Yu-Feng, and Zhi-Hua 2013; Bi and Kwok 2014; Doppa et al. 2014) has emerged as a new and increasingly important research topic with the capacity to handle such tasks. The most straightforward solution to multi-label learning is to decompose the problem into a series of binary classification problems, one for each label (Boutell et al. 2004).
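The binary-relevance decomposition mentioned above can be sketched as follows. This is a hypothetical minimal illustration of training one independent binary classifier per label (in the spirit of Boutell et al. 2004); the simple centroid-distance classifier is an illustrative stand-in, not the method proposed in this paper.

```python
# Binary relevance: one independent binary classifier per label.
# The centroid classifier below is a toy stand-in for illustration.
from typing import List


class CentroidBinary:
    """Classify by squared distance to the positive/negative centroids."""

    def fit(self, X: List[List[float]], y: List[int]) -> "CentroidBinary":
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        dim = len(X[0])
        self.pos_c = [sum(x[d] for x in pos) / len(pos) for d in range(dim)]
        self.neg_c = [sum(x[d] for x in neg) / len(neg) for d in range(dim)]
        return self

    def predict(self, x: List[float]) -> int:
        dp = sum((a - b) ** 2 for a, b in zip(x, self.pos_c))
        dn = sum((a - b) ** 2 for a, b in zip(x, self.neg_c))
        return 1 if dp <= dn else 0


def binary_relevance_fit(X, Y):
    """Train one binary classifier per label column of Y."""
    n_labels = len(Y[0])
    return [CentroidBinary().fit(X, [row[j] for row in Y])
            for j in range(n_labels)]


def binary_relevance_predict(models, x):
    """Predict each label independently, ignoring label relationships."""
    return [m.predict(x) for m in models]


# Toy data: two features, two labels; label 0 fires for large x[0],
# label 1 fires for large x[1].
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1]]
models = binary_relevance_fit(X, Y)
print(binary_relevance_predict(models, [0.95, 0.05]))  # -> [1, 0]
```

Because each label is predicted in isolation, this baseline cannot express that ‘lake’ implies ‘water’, which is exactly the limitation the paper targets.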
However, this solution is limited because it neglects the relationships between labels. Learning multiple labels simultaneously has been empirically shown to significantly improve performance relative to independent label learning, especially when there are insufficient training examples for some labels. It is therefore advantageous to exploit the relationships between labels, and some works have already used this strategy. For example, (Cai and Hofmann 2004; Cesa-Bianchi, Gentile, and Zaniboni 2006; Rousu et al. 2005; Hariharan et al. 2010; Bi and Kwok 2011) utilize external knowledge, such as label hierarchies and label correlation matrices, to derive the label relationships. Since these external knowledge resources are often unavailable in real-world applications, other studies (Sun, Ji, and Ye 2008; Tsoumakas et al. 2009; Petterson and Caetano 2011) have attempted to exploit label relationships by counting the co-occurrence of labels in the training data. In practice, the label relationship is asymmetric rather than symmetric, as assumed by most existing multi-label learning algorithms. For example, an image labeled ‘lake’ implies the label ‘water’, but the inverse is not true. Only a small number of works have tried to exploit this asymmetric label relationship. For example, in (Zhang and Zhang 2010), a Bayesian network was used to characterize the dependence structure between multiple labels, and a binary classifier was learned for each label by treating its parental labels in the dependence structure as additional input features. (Huang, Yu, and Zhou 2012) assumed that if two labels are related, the hypothesis generated for one label can be helpful for the other, and implemented this idea as a boosting approach with a hypothesis-reuse mechanism.
Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
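The parental-label idea attributed to (Zhang and Zhang 2010) can be illustrated with a small sketch. This is a hedged toy example, not their implementation: the dependence structure (`parents` map) is assumed to be given, and `augment` merely appends the parent-label values to the raw feature vector before a per-label classifier would be trained.

```python
# Hypothetical sketch: treat a label's parents in a given dependence
# structure as extra input features for that label's binary classifier.
# The parent map below is assumed for illustration, not learned.
parents = {"water": [], "lake": ["water"]}  # 'lake' depends on 'water'


def augment(x, label, label_values):
    """Append the values of `label`'s parental labels to the features."""
    return list(x) + [label_values[p] for p in parents[label]]


x = [0.2, 0.7]
print(augment(x, "lake", {"water": 1}))  # -> [0.2, 0.7, 1]
print(augment(x, "water", {}))           # -> [0.2, 0.7]
```

The classifier for ‘lake’ thus sees whether ‘water’ is present, encoding the asymmetric dependence, while the classifier for ‘water’ is unchanged.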
The features of each example in multi-label learning determine the appearance of its labels; thus the feature vector itself can be seen as a disorderly mixture of the properties originating from diverse labels. To comprehensively understand the asymmetric label relationship, we define the relationship as causality, and propose to reveal the causality from the perspective of features. Moreover, theoretical results are needed to guarantee that exploiting causality is actually beneficial for multi-label learning. In this paper, we intend to transform the original features of examples, which are shared by different labels, into causal features corresponding to each individual label. Following the large-margin principle, we propose a new algorithm termed Large-margin Multi-label Causal Feature learning (LMCF) to achieve this aim. The discovered causal features reveal causality between labels from the perspective of features. Geometrically, the causality is encoded by the ‘margins’ corresponding to different labels on the hyperplane of causal features. By encouraging the margins to be large while sat…
(Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence)
Similar Resources
MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...
Semi-supervised Multi-label Classification - A Simultaneous Large-Margin, Subspace Learning Approach
Labeled data is often sparse in common learning scenarios, either because it is too time-consuming or too expensive to obtain, while unlabeled data is almost always plentiful. This asymmetry is exacerbated in multi-label learning, where the labeling process is more complex than in the single-label case. Although it is important to consider semi-supervised methods for multi-label learning, as it ...
Multi-Kernel Multi-Label Learning with Max-Margin Concept Network
In this paper, a novel method is developed for enabling multi-kernel multi-label learning. Inter-label dependency and similarity diversity are simultaneously leveraged in the proposed method. A concept network is constructed to capture the inter-label correlations for classifier training. A maximal-margin approach is used to effectively formulate the feature-label associations and the label-label c...
Large Margin Metric Learning for Multi-Label Prediction
Canonical correlation analysis (CCA) and maximum margin output coding (MMOC) methods have shown promising results for multi-label prediction, where each instance is associated with multiple labels. However, these methods require an expensive decoding procedure to recover the multiple labels of each testing instance. The testing complexity becomes unacceptable when there are many labels. To avoi...
Multi-Instance Multi-Label Learning with Weak Label
Multi-Instance Multi-Label learning (MIML) deals with data objects that are represented by a bag of instances and associated with a set of class labels simultaneously. Previous studies typically assume that for every training example, all positive labels are tagged whereas the untagged labels are all negative. In many real applications such as image annotation, however, the learning problem oft...
Publication year: 2015